The question we chose was to see if urbanization indicated a higher fertility rate.
3/29/2023
Understanding the relationship between urbanization and fertility rates can assist in determining the demand for resources and services in both urban and rural areas.
This information can be used by governments and organizations to allocate resources more effectively for things such as healthcare, education, housing, and infrastructure.
High fertility rates can strain resources and services, while rapid urban growth may lead to challenges in providing adequate infrastructure, housing, and job opportunities.
Identifying these relationships can help inform strategies to promote sustainable development and economic growth.
We obtained our data from the United Nations at data.un.org.
We used two separate datasets: “Population in the capital city, urban, and rural areas” and “Population growth, fertility, life expectancy, and mortality”.
These two datasets include several variables; the ones we were interested in were urban percentage, urban percentage growth, rural percentage growth, and fertility rate.
Urban percentage measures the percentage of a country's population living in urban areas (cities).
Urban percentage growth measures the percentage growth of the urban areas per year tracked over a 5-year period preceding the reference year.
Rural percentage growth measures the percentage growth of rural areas per year tracked over a 5-year period preceding the reference year.
Urban population percentage is a proxy for the degree of urbanization, reflecting the proportion of the population living in urban areas.
Urban growth percentage and rural growth percentage are proxies for the rate at which urban and rural populations are changing, respectively, indicating the dynamics of urbanization over time.
The variables of interest were stored under a single “series” column, so we used pivot_wider from tidyr to give each variable its own column and make the data tidy.
Within each dataset, we dropped irrelevant data and renamed the columns.
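The reshaping step described above can be sketched as follows. This is a minimal illustration, not our exact script: the values are made up, and the series labels and the new column names (urban_pct, urban_growth, rural_growth) are assumptions chosen for the example.

```r
library(dplyr)
library(tidyr)

# Made-up long-format rows; the series labels are illustrative assumptions.
urban_long <- tibble::tribble(
  ~country,    ~year, ~series,                                            ~value,
  "Country A", 2010,  "Urban population (percent)",                       40.0,
  "Country A", 2010,  "Urban population (percent growth rate per annum)", 2.5,
  "Country A", 2010,  "Rural population (percent growth rate per annum)", 0.8,
  "Country A", 2010,  "Capital city population (thousands)",              1200,
  "Country B", 2010,  "Urban population (percent)",                       75.0,
  "Country B", 2010,  "Urban population (percent growth rate per annum)", 0.9,
  "Country B", 2010,  "Rural population (percent growth rate per annum)", -0.4,
  "Country B", 2010,  "Capital city population (thousands)",              3400
)

# Series we want to keep; everything else is dropped as irrelevant.
keep_series <- c(
  "Urban population (percent)",
  "Urban population (percent growth rate per annum)",
  "Rural population (percent growth rate per annum)"
)

urban_wide <- urban_long %>%
  filter(series %in% keep_series) %>%
  # One column per series, one row per country and year.
  pivot_wider(names_from = series, values_from = value) %>%
  rename(
    urban_pct    = `Urban population (percent)`,
    urban_growth = `Urban population (percent growth rate per annum)`,
    rural_growth = `Rural population (percent growth rate per annum)`
  )
```

After this step each country-year combination occupies a single row, which is the shape the later join and plots expect.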
Our fertility dataset included the reference year 2022. Since our urban dataset did not have this year, we used left_join (with the urban dataset on the left) when combining the datasets, which drops 2022 from our finished dataset.
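A small sketch of why the join drops 2022, using made-up rows and assumed column names (country, year, urban_pct, fertility_rate). It assumes the urban data is the left-hand table, so only its country-year combinations survive.

```r
library(dplyr)

# Made-up rows; the urban data has no 2022 reference year.
urban <- tibble::tibble(
  country   = c("Country A", "Country A"),
  year      = c(2010, 2015),
  urban_pct = c(40, 45)
)

fertility <- tibble::tibble(
  country        = c("Country A", "Country A", "Country A"),
  year           = c(2010, 2015, 2022),
  fertility_rate = c(3.1, 2.9, 2.6)
)

# left_join keeps every row of `urban` and only the matching rows of `fertility`,
# so the 2022 fertility row has nothing to attach to and is left out.
combined <- left_join(urban, fertility, by = c("country", "year"))
```

Swapping the two tables, or using full_join, would instead keep a 2022 row with NA in the urban columns.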
We then used the countrycode package to add continent and subregion information so we could analyze the data by continent and subregion.
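As a rough sketch of this step, country names can be mapped to continents and subregions with countrycode(). The choice of origin ("country.name") and destination codes ("continent", "un.regionsub.name") is an assumption for illustration; our app may have used different codes.

```r
library(countrycode)

# Example country names; the real data would supply this column.
countries <- data.frame(
  country = c("Nigeria", "France", "Japan"),
  stringsAsFactors = FALSE
)

# Map each country name to a continent.
countries$continent <- countrycode(
  countries$country,
  origin = "country.name",
  destination = "continent"
)

# Map each country name to a UN sub-region name (assumed destination code).
countries$subregion <- countrycode(
  countries$country,
  origin = "country.name",
  destination = "un.regionsub.name"
)
```

Names that countrycode cannot match come back as NA with a warning, so it is worth checking for NA values after this step.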
sidebarPanel from:
observeEvent can check for changes to the continent filter so that sub-regions are shown when only one continent is selected.
In the urban dataset, one of the reference years is 2017, while the corresponding reference year in the fertility dataset is 2018. Since the 2018 data is valid for the previous 4 years, we decided to treat the 2017 reference period as congruent with the 2018 reference period.
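One way to treat the 2017 and 2018 reference years as congruent is to relabel the year before joining. The sketch below relabels the urban data; which table is relabeled, the column names, and the values are all assumptions for illustration.

```r
library(dplyr)

# Made-up rows: urban data uses 2017, fertility data uses 2018.
urban <- tibble::tibble(
  country   = c("Country A", "Country A"),
  year      = c(2015, 2017),
  urban_pct = c(45, 47)
)

fertility <- tibble::tibble(
  country        = c("Country A", "Country A"),
  year           = c(2015, 2018),
  fertility_rate = c(2.9, 2.8)
)

# Relabel the 2017 urban reference year as 2018 so the join keys line up.
urban_aligned <- urban %>%
  mutate(year = if_else(year == 2017, 2018, year))

combined <- left_join(urban_aligned, fertility, by = c("country", "year"))
```

Without the relabeling, the 2017 urban row would have no matching fertility row and its fertility rate would be NA.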
For weaknesses in the data itself, UN data is often based on information provided by individual countries, and the quality of this information can vary significantly depending on each country’s data collection and reporting practices.
To populate the available input years, we computed a summary of the number of observations in each year that contained non-NA values for fertility rate and at least one of urban percentage and urban percentage growth.
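A summary of this kind could be built roughly as follows. This is a sketch under assumptions: the column names, the status labels, and the rule that "valid" means all three variables are present are illustrative, and the rows are made up.

```r
library(dplyr)
library(tidyr)

# Made-up combined data with assumed column names.
combined <- tibble::tribble(
  ~country,    ~year, ~fertility_rate, ~urban_pct, ~urban_growth,
  "Country A", 2010,  3.1,             40,         2.5,
  "Country B", 2010,  NA_real_,        70,         1.0,
  "Country A", 2018,  2.8,             47,         NA_real_,
  "Country B", 2018,  NA_real_,        72,         NA_real_
)

validity <- combined %>%
  mutate(status = case_when(
    # Usable in both plots: fertility plus both urban measures.
    !is.na(fertility_rate) & !is.na(urban_pct) & !is.na(urban_growth) ~ "valid",
    # Usable only in the urban percentage plot.
    !is.na(fertility_rate) & !is.na(urban_pct) ~ "urb_valid_only",
    TRUE ~ "invalid"
  )) %>%
  count(year, status) %>%
  # One column per status; years with no rows in a status get NA.
  pivot_wider(names_from = status, values_from = n)
```

The NA cells produced by pivot_wider are what the later is.na() checks rely on to decide which years are offered as inputs.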
| Year | Valid (Both Plots) | Valid (Urban % Plot Only) | No Fertility Data | No Fertility / Urban Growth Data |
|---|---|---|---|---|
| 2001 | NA | NA | NA | 1 |
| 2005 | NA | NA | 232 | NA |
| 2010 | 225 | NA | 7 | NA |
| 2015 | 225 | NA | 7 | NA |
| 2018 | NA | 182 | NA | 50 |
Valid years are then the years in which at least one of the valid columns is non-NA:
valid_yrs = validity$year[!is.na(validity$valid) | !is.na(validity$urb_valid_only)]
valid_yrs_both_plots = validity$year[!is.na(validity$valid)]
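As a worked example, the selection can be applied to the counts from the table above. The data frame here is re-typed by hand from that table, and the mapping of the column names valid and urb_valid_only to the "Valid (Both Plots)" and "Valid (Urban % Plot Only)" columns is an assumption.

```r
# Counts copied from the summary table; NA means no observations in that category.
validity <- data.frame(
  year           = c(2001, 2005, 2010, 2015, 2018),
  valid          = c(NA, NA, 225, 225, NA),
  urb_valid_only = c(NA, NA, NA, NA, 182)
)

# Years usable in at least one plot.
valid_yrs <- validity$year[!is.na(validity$valid) | !is.na(validity$urb_valid_only)]

# Years usable in both plots.
valid_yrs_both_plots <- validity$year[!is.na(validity$valid)]

print(valid_yrs)             # 2010 2015 2018
print(valid_yrs_both_plots)  # 2010 2015
```

So 2018 is offered as an input year, but only the urban percentage plot has data for it.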
Slider input ranges were set manually by buffering the min and max values for each respective input to round numbers.
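We set these ranges by hand, but the same buffering could be automated. The helper below is a hypothetical sketch (buffered_range is not part of our app or of Shiny) that rounds the minimum down and the maximum up to a chosen step.

```r
# Hypothetical helper: widen the observed range to the nearest multiples of `step`.
buffered_range <- function(x, step) {
  c(
    floor(min(x, na.rm = TRUE) / step) * step,
    ceiling(max(x, na.rm = TRUE) / step) * step
  )
}

# Made-up fertility rates for illustration.
fertility_rate <- c(1.2, 2.7, 4.9, 6.8)

rng <- buffered_range(fertility_rate, step = 1)
print(rng)  # 1 7
```

The two values could then be passed as the min and max arguments of a sliderInput, with a larger step for variables that span a wider range.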
The initial view of the app includes all but a few upper outliers in urban growth percentage.
Filtering to show just Africa and Europe demonstrates the relationship between fertility rate and growing urban populations that aren’t already a large portion of a country’s population (i.e. high growth but low urban percentage).
However, the patterns visualized could also just be confirming prior knowledge about developing countries.
Correlation may be present, but the data doesn’t necessarily show direct causation.